---
title: "Updates to Cue Profile Analysis"
output: html_notebook
---

```{r,echo=FALSE,results='hide',warning=FALSE}
library(plotly)
library(ggplot2)
library(RColorBrewer)
library(patchwork)
library(dplyr)  # filter() is used on angle_data in the subject tables below
```

## Angle Calculation

I was pulling the wrong coefficients from the LDA, which is why the angles were in the wrong range last time.  

I am now pulling the right coefficients, but the angles still do not match the published ones for 9 subjects.  I think this is because we do not have all of the data for subjects who were run multiple times.

All the figures and tables in Google Slides are updated.
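
For reference, here is a minimal sketch of one way to turn LDA coefficients into an angle. It is an assumption, not the exact calculation used for these subjects: the simulated trials, the placeholder response label `other`, and the choice of `atan2` on the absolute coefficients are all illustrative.

```r
library(MASS)  # lda()

set.seed(1)
n <- 200

# Simulated trials on a 5 x 5 stimulus grid (hypothetical data)
trials <- data.frame(
  Stimuli_1 = sample(1:5, n, replace = TRUE),
  Stimuli_2 = sample(1:5, n, replace = TRUE)
)

# Simulated listener who relies mostly on Stimuli_1
score <- 2 * (trials$Stimuli_1 - 3) + 0.5 * (trials$Stimuli_2 - 3) + rnorm(n)
trials$response <- factor(ifelse(score > 0, "BAH", "other"))

# Fit the LDA and pull the discriminant coefficients, one per stimulus dimension
fit <- lda(response ~ Stimuli_1 + Stimuli_2, data = trials)
coefs <- fit$scaling[, 1]

# Assumed convention: angle of the coefficient vector in degrees, 0 to 90
angle_deg <- unname(atan2(abs(coefs["Stimuli_2"]), abs(coefs["Stimuli_1"])) * 180 / pi)
angle_deg
```

Pulling the wrong element of the fitted object (for example the group means instead of `scaling`) would still give a number, just in a different range, which is the kind of error described above.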

## Deploying to tablets

* got a tablet to test on and things downloaded
* still need to test

# Predicting Weighting Class

Defining the input features

* _corner accuracy_:  how accurately you identified the corner stimuli (e.g. Stimuli_1=5, Stimuli_2=1, you respond 'BAH')
* _semi-corner accuracy_:  how accurately you identified the "inner ring" stimuli (e.g. Stimuli_1=4, Stimuli_2=2, you respond 'BAH', or Stimuli_1=4, Stimuli_2=1, you respond 'BAH')
* _edge accuracy_:  whether or not you were correct when one dimension was unambiguous (e.g. Stimuli_1=5, Stimuli_2=3, you respond 'BAH')
* _demographics_:  STMDetectionThresholdL, STMDetectionThresholdR, STMSensitivityL, STMSensitivityR, C_Tpracticerounds, AgeCT, R3FPTA, L3FPTA
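
The accuracy features above can be sketched as a lookup from stimulus cell to stimulus type, followed by a per-type mean. This is only an illustration: the lookup table holds just the example cells named in the list, the trials are made up, and `other` is a placeholder for any non-'BAH' response.

```r
# Stimulus cells taken only from the examples in the list above; the full
# set of cells per type is not given here, so this table is incomplete.
stim_types <- data.frame(
  Stimuli_1 = c(5, 4, 4, 5),
  Stimuli_2 = c(1, 2, 1, 3),
  type      = c("corner", "semi-corner", "semi-corner", "edge"),
  expected  = "BAH"
)

# Hypothetical trials for one subject
trials <- data.frame(
  Stimuli_1 = c(5, 5, 4, 4, 5, 5, 3),
  Stimuli_2 = c(1, 1, 2, 1, 3, 3, 3),
  response  = c("BAH", "other", "BAH", "BAH", "BAH", "other", "BAH")
)

# Keep only trials whose cell has a known type, then score each trial
labelled <- merge(trials, stim_types, by = c("Stimuli_1", "Stimuli_2"))
labelled$correct <- labelled$response == labelled$expected

# Proportion correct per stimulus type
accuracy <- aggregate(correct ~ type, data = labelled, FUN = mean)
accuracy
```

Trials at cells not in the lookup (here the 3/3 trial) are dropped by the merge and do not count toward any feature.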

I chose demographic fields that looked like direct hearing-test measures rather than aggregates of some other feature, but picking better demographics is something we should talk about.

No demographic information is available for subjects 28, 53, 246, 374, 514, 533, or 555.

## Clustering

We ran automatic clustering to define cue categories based on the weighting angle.  The setup that gave the most consistent values within clusters was a 3-way classification: Temporal, Spectral, Neither.

Ways I played with the algorithm:

* number of clusters
* distance measure: <span style="color:red">euclidean</span>
* distance threshold: <span style="color:red">100</span>
* variance minimization: <span style="color:red">ward (minimize within-group variance)</span>

Input features:

* corner, semi-corner and edge accuracy counts (6)
* all prediction type-response triplets (100)
* <span style="color:red">final angle (1)</span>
* corner, semi-corner, edge, and angle (7)

Measures in red were used in the final clustering.
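
A minimal sketch of Ward clustering on a single angle feature, using base R's `hclust`. The angles are simulated, and the tree is cut into three clusters with `k = 3` rather than with a distance threshold, because merge heights are not necessarily on the same scale across implementations; treat both choices as assumptions.

```r
set.seed(42)

# Simulated weighting angles in degrees: three loose groups (hypothetical data)
angles <- c(rnorm(10, mean = 10, sd = 3),
            rnorm(10, mean = 45, sd = 3),
            rnorm(10, mean = 80, sd = 3))

# Euclidean distances on the one-dimensional angle feature
d <- dist(angles, method = "euclidean")

# Ward linkage: each merge minimizes the increase in within-cluster variance
tree <- hclust(d, method = "ward.D2")

# Cut the tree into three clusters
clusters <- cutree(tree, k = 3)

# Angle range covered by each cluster
tapply(angles, clusters, range)
```

Checking the per-cluster ranges like this is one way to judge how consistent the values within each cluster are.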

## Random Forest Classifier

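When trained on corner + semi-corner + edge + angle features, the classifier perfectly predicts held-out class labels (train/test acc 100%) for full

Without angle as input, train accuracy drops to 0.6 and test accuracy to 0.57.  The perfect score with angle is presumably expected, since the class labels come from clustering on the final angle.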
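
To illustrate why the angle feature matters so much, here is a small sketch on simulated data. It swaps in a single decision tree from `rpart` for the random forest, and the cut points, class names (`low`, `mid`, `high`) and `noise` feature are all hypothetical. The point is only that when labels are derived from the angle, a model that sees the angle can recover them almost exactly.

```r
library(rpart)  # single decision tree, standing in for a random forest

set.seed(7)
n <- 150

# Simulated angles and labels derived directly from them (hypothetical cut points)
angle <- runif(n, 0, 90)
label <- cut(angle, breaks = c(0, 30, 60, 90),
             labels = c("low", "mid", "high"), include.lowest = TRUE)

# One uninformative feature standing in for everything that is not the angle
dat <- data.frame(angle = angle, noise = rnorm(n), label = label)
train <- dat[1:100, ]
test  <- dat[101:150, ]

with_angle    <- rpart(label ~ angle + noise, data = train, method = "class")
without_angle <- rpart(label ~ noise, data = train, method = "class")

# Held-out accuracy of a fitted model
acc <- function(model) mean(predict(model, test, type = "class") == test$label)

acc_with    <- acc(with_angle)
acc_without <- acc(without_angle)
c(with_angle = acc_with, without_angle = acc_without)
```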

## Behavior of the different LSTMs

We have been looking at ways of predicting a coarse-grained category for angles using statistical and neural methods.

```{r,echo=FALSE}
# LSTM results
data.frame("input feats" = c("base", "base", "base + corner",
                           "base + corner + LDA", "base + corner + LDA + demographics",
                           "base + corner", "base + corner"),
           "step size" = c(25,10,10,10,10,5,1),
           "train accuracy" = c("88%", "830/835 (99%)", "835/835 (100%)", "835/835 (100%)",
                              "819/835 (98%)", "1647/1647 (100%)", "8143/8143 (100%)"),
           "test accuracy" = c("90%", "100%", "100%", "100%", "99%", "100%", "100%"))

# Load the angle data so I can refer back to it
angle_data <- read.csv('/Users/sara/Documents/research/nih/comparing-error-rates/nih/angle.tsv', sep = '\t')
# Subjects with no demographic information
no.demo <- c(28, 53, 246, 374, 514, 533, 555)
# Flag subjects whose demographics are missing (TRUE = missing)
angle_data$Missing.Demo <- angle_data$Subj %in% no.demo
```

```{r,echo=FALSE}
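# Column names such as "X10" hold the number of trials; drop the first
# character and convert the rest to a number (vectorized)
make_index <- function(ind){
  return(as.numeric(substring(as.character(ind), 2)))
}
# Read a wide results table and stack it into long format:
# the first column is kept as the id, the remaining columns become values/ind
pull_table <- function(path){
  data <- read.csv(path)
  data <- cbind(data[1], stack(data[2:ncol(data)]))
  data$ind <- make_index(data$ind)
  return(data)
}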
```
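
As a quick illustration of the wide-to-long reshape, here is a sketch on a made-up table. The column names `subj`, `10` and `20` and the values are assumptions about the file layout, not taken from the real result files.

```r
# Hypothetical wide table: one row per subject, one column per number of trials
wide <- data.frame(subj = c(101, 102),
                   `10` = c(0.6, 0.5),
                   `20` = c(0.9, 0.7),
                   check.names = FALSE)
path <- tempfile(fileext = ".csv")
write.csv(wide, path, row.names = FALSE)

# read.csv() turns the numeric headers into syntactic names: "X10", "X20"
data <- read.csv(path)

# stack() gives one row per (column, value) pair; the id column is recycled
long <- cbind(data[1], stack(data[2:ncol(data)]))

# Strip the leading "X" to recover the number of trials
long$ind <- as.numeric(substring(as.character(long$ind), 2))
long
```

The result has one row per subject and trial count, which is the shape the plotting code expects (`subj`, `values`, `ind`).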

Plot the various models.
```{r, echo=FALSE}
getPalette <- colorRampPalette(brewer.pal(8, "Set2"))
make_plot = function(data,ylab='Probability of correct class',title='Base + Corners, Step 10'){
  p <- ggplot(data=data, aes(x=ind, y=values,color=as.factor(subj))) +
    geom_line() +
    geom_point()+
    scale_color_manual(name='Subject',values = getPalette(29))+
    scale_x_continuous(name='Number of trials in test sequence')+
    scale_y_continuous(name=ylab)+
    ggtitle(title)
  return(ggplotly(p))
}
```
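
The sections that follow repeat the same load-and-plot steps with a different file prefix each time. A small helper along these lines could build both paths from the prefix; `result_paths` and the relative directory are hypothetical names, and the `_rnn_probs.csv` / `_rnn_bad_probs.csv` suffixes follow the file names used in this notebook.

```r
# Hypothetical helper: build the two result paths for one model prefix,
# e.g. "base_10" or "corners_10"
result_paths <- function(prefix, dir) {
  list(
    good = file.path(dir, paste0(prefix, "_rnn_probs.csv")),
    bad  = file.path(dir, paste0(prefix, "_rnn_bad_probs.csv"))
  )
}

paths <- result_paths("base_10", "lstm_results")
paths$good
paths$bad
```

With `pull_table` and `make_plot` from this notebook, each section would then reduce to `make_plot(pull_table(paths$good), title = ...)` plus the matching call for `paths$bad`.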

## Base, Step 25
```{r,echo=FALSE}
good_probs <- '/Users/sara/Documents/research/nih/comparing-error-rates/lstm_results/base_25_rnn_probs.csv'
bad_probs <- '/Users/sara/Documents/research/nih/comparing-error-rates/lstm_results/base_25_rnn_bad_probs.csv'
good_data <- pull_table(good_probs)
bad_data <- pull_table(bad_probs)
make_plot(good_data,title='Base, Step 25') 
make_plot(bad_data,ylab='Highest Probability of incorrect class',title='Base, Step 25')
```

## Base, Step 10
```{r,echo=FALSE}
good_probs <- '/Users/sara/Documents/research/nih/comparing-error-rates/lstm_results/base_10_rnn_probs.csv'
bad_probs <- '/Users/sara/Documents/research/nih/comparing-error-rates/lstm_results/base_10_rnn_bad_probs.csv'
good_data <- pull_table(good_probs)
bad_data <- pull_table(bad_probs)
make_plot(good_data,title='Base, Step 10') 
make_plot(bad_data,ylab='Highest Probability of incorrect class',title='Base, Step 10')
```
The subjects who are taking longer to converge:
```{r,echo=FALSE}
bad_subjects = c(536,525,537,556,529)
filter(angle_data,Subj %in% bad_subjects)
```

## Base & Corner, Step 10

```{r,echo=FALSE}
good_probs <- '/Users/sara/Documents/research/nih/comparing-error-rates/lstm_results/corners_10_rnn_probs.csv'
bad_probs <- '/Users/sara/Documents/research/nih/comparing-error-rates/lstm_results/corners_10_rnn_bad_probs.csv'
good_data <- pull_table(good_probs)
bad_data <- pull_table(bad_probs)
make_plot(good_data,title='Base & Corner, Step 10') 
make_plot(bad_data,ylab='Highest Probability of incorrect class',title='Base & Corner, Step 10')
```

## Base, Corner, & Predicted LDA, Step 10

```{r,echo=FALSE}
good_probs <- '/Users/sara/Documents/research/nih/comparing-error-rates/lstm_results/corners_and_lda_10_rnn_probs.csv'
bad_probs <- '/Users/sara/Documents/research/nih/comparing-error-rates/lstm_results/corners_and_lda_10_rnn_bad_probs.csv'
good_data <- pull_table(good_probs)
bad_data <- pull_table(bad_probs)
make_plot(good_data,title='Base, Corner & LDA, Step 10') 
make_plot(bad_data,ylab='Highest Probability of incorrect class',title='Base, Corner & LDA, Step 10')
```
The subjects who are taking longer to converge:
```{r,echo=FALSE}
bad_subjects = c(571,542,52,47,516,510)
filter(angle_data,Subj %in% bad_subjects)
```
## Base, Corner, Predicted LDA, & Demographics, Step 10

```{r,echo=FALSE}
good_probs <- '/Users/sara/Documents/research/nih/comparing-error-rates/lstm_results/corner_lda_demo_10_rnn_probs.csv'
bad_probs <- '/Users/sara/Documents/research/nih/comparing-error-rates/lstm_results/corner_lda_demo_10_rnn_bad_probs.csv'
good_data <- pull_table(good_probs)
bad_data <- pull_table(bad_probs)
make_plot(good_data,title='Base, Corner, LDA, Demographics, Step 10') 
make_plot(bad_data,ylab='Highest Probability of incorrect class',title='Base, Corner, LDA, Demographics, Step 10')

```
The subjects who are taking longer to converge:
```{r,echo=FALSE}
bad_subjects = c(272,47,537,576,510,525,542)
filter(angle_data,Subj %in% bad_subjects)
```
## Base & Corner, Step 5

```{r,echo=FALSE}
good_probs <- '/Users/sara/Documents/research/nih/comparing-error-rates/lstm_results/corner_5_rnn_probs.csv'
bad_probs <- '/Users/sara/Documents/research/nih/comparing-error-rates/lstm_results/corner_5_rnn_bad_probs.csv'
good_data <- pull_table(good_probs)
bad_data <- pull_table(bad_probs)
make_plot(good_data,title='Base & Corner, Step 5') 
make_plot(bad_data,ylab='Highest Probability of incorrect class',title='Base & Corner, Step 5')
```

## Base & Corner, Step 1

```{r,echo=FALSE}
good_probs <- '/Users/sara/Documents/research/nih/comparing-error-rates/lstm_results/corner_1_rnn_probs.csv'
bad_probs <- '/Users/sara/Documents/research/nih/comparing-error-rates/lstm_results/corner_1_rnn_bad_probs.csv'
good_data <- pull_table(good_probs)
bad_data <- pull_table(bad_probs)
make_plot(good_data,title='Base & Corner, Step 1') 
make_plot(bad_data,ylab='Highest Probability of incorrect class',title='Base & Corner, Step 1')
```


## Base, Corner, LDA & Demographics, Step 1

```{r,echo=FALSE}
good_probs <- '/Users/sara/Documents/research/nih/comparing-error-rates/lstm_results/corner_lda_demo_1_rnn_probs.csv'
bad_probs <- '/Users/sara/Documents/research/nih/comparing-error-rates/lstm_results/corner_lda_demo_1_rnn_bad_probs.csv'
good_data <- pull_table(good_probs)
bad_data <- pull_table(bad_probs)
make_plot(good_data,title='Base, Corner, LDA, Demographics, Step 1') 
make_plot(bad_data,ylab='Highest Probability of incorrect class',title='Base, Corner, LDA, Demographics, Step 1')

```